Add parallel-task-set crate, test it, use it #8174

Merged: 2 commits into main from parallel-task-set, May 19, 2025

Conversation

@smklein (Collaborator) commented May 15, 2025

Follow-up from support bundles work.

This crate exposes a "JoinSet"-like interface which also has a bound on maximum parallelism.
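
To illustrate the idea, here is a minimal sketch of one way such an interface can be built on top of tokio: a `JoinSet` whose futures each acquire a `Semaphore` permit before doing any work. This is illustrative only, not the crate's actual API or implementation; the type and method names here are made up.

```rust
use std::sync::Arc;
use tokio::sync::Semaphore;
use tokio::task::{JoinError, JoinSet};

/// Hypothetical sketch: a JoinSet-like set in which no more than
/// `max_parallelism` tasks *execute* at once.
struct BoundedSet<T> {
    tasks: JoinSet<T>,
    semaphore: Arc<Semaphore>,
}

impl<T: Send + 'static> BoundedSet<T> {
    fn new(max_parallelism: usize) -> Self {
        Self {
            tasks: JoinSet::new(),
            semaphore: Arc::new(Semaphore::new(max_parallelism)),
        }
    }

    /// Spawns the task immediately, but it does no work until a
    /// semaphore permit is available.
    fn spawn<F>(&mut self, future: F)
    where
        F: std::future::Future<Output = T> + Send + 'static,
    {
        let semaphore = Arc::clone(&self.semaphore);
        self.tasks.spawn(async move {
            let _permit = semaphore
                .acquire_owned()
                .await
                .expect("semaphore is never closed");
            future.await
        });
    }

    /// Waits for the next task to complete, JoinSet-style.
    async fn join_next(&mut self) -> Option<Result<T, JoinError>> {
        self.tasks.join_next().await
    }
}
```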

@hawkw (Member) left a comment

looks good to me, with a couple of nits. there are some places in Nexus we could use this too!

@hawkw (Member) commented May 16, 2025

Test failure in build-and-test (helios) looks like a flake (in one of my tests, agh).

@smklein merged commit b6eff04 into main on May 19, 2025 (18 checks passed)
@smklein deleted the parallel-task-set branch on May 19, 2025 at 15:30

@hawkw added a commit that referenced this pull request on May 20, 2025:
Presently, several Nexus background tasks use `tokio::task::JoinSet` to
run a large number of Tokio tasks in parallel. In these tasks, we
typically set a concurrency limit on the number of spawned tasks using
the size of a database query that's used to determine the tasks that
should be spawned. We perform the query with a small page size, spawn a
group of tasks, and wait for them to complete, in a loop, until the
query returns no records.

While this is simple to implement, it's not the ideal way to do this, as
it will unnecessarily limit the throughput of the spawned tasks. This is
because this pattern does not ensure that *exactly* `$CONCURRENCY_LIMIT`
tasks are running at a given time; it only ensures that *up to*
`$CONCURRENCY_LIMIT` tasks are running. Since the database is not
queried again to spawn a new batch of tasks until after the *entire*
batch of tasks complete, there will always be some period of time during
which only a single task is running and all the others have completed.
If there's a relatively large variation in how long those tasks take to
complete, one slow task can potentially prevent any others from starting
for a longish period of time. An alternative approach, where the tasks
are spawned all at once but made to wait on a `tokio::sync::Semaphore`
before they actually begin executing, allows us to maximize throughput
while limiting concurrency. In this approach, a new task will begin
executing immediately as soon as another task finishes, so there are
always exactly `$CONCURRENCY_LIMIT` tasks running until the final batch
of tasks begins to complete.
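
For contrast, the batch-wait pattern looks roughly like this, where `fetch_next_page` and `handle` are hypothetical placeholders for the paginated query and the per-record work:

```rust
// Sketch of the batch-wait pattern described above. Concurrency never
// exceeds the page size, but it also decays toward a single running
// task at the tail of every batch.
loop {
    // Hypothetical paginated query returning up to LIMIT records.
    let batch = fetch_next_page(CONCURRENCY_LIMIT).await;
    if batch.is_empty() {
        break;
    }
    let mut tasks = tokio::task::JoinSet::new();
    for record in batch {
        tasks.spawn(handle(record)); // hypothetical per-record work
    }
    // No new work can start until the *slowest* task in this batch
    // finishes; this is the throughput gap described above.
    while let Some(_result) = tasks.join_next().await {}
}
```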

The `parallel-task-set` crate added in PR #8174 implements a reusable
abstraction for this, so this branch updates the `instance_watcher` and
`webhook_deliverator` background tasks to use it. Furthermore, @smklein
and I spent some time tweaking the `ParallelTaskSet` API to make it
easier to limit not only the number of tasks _executing_ in parallel,
but also the number of tasks _resident in memory_ at any given time, by
changing the `spawn` method to wait for a previous task to complete
if the set is already at the limit.
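
A hedged sketch of that `spawn` behavior, extending the `BoundedSet` illustration from the PR description above (again, not the actual `ParallelTaskSet` code):

```rust
impl<T: Send + 'static> BoundedSet<T> {
    /// Illustrative only: spawn with backpressure. If `limit` tasks
    /// are already resident, first reap one completed task and hand
    /// its result back to the caller, so that at most `limit` tasks
    /// exist (running or waiting on a permit) at any given time.
    async fn spawn_with_backpressure<F>(
        &mut self,
        future: F,
        limit: usize,
    ) -> Option<T>
    where
        F: std::future::Future<Output = T> + Send + 'static,
    {
        let mut completed = None;
        if self.tasks.len() >= limit {
            completed = self
                .tasks
                .join_next()
                .await
                .map(|result| result.expect("task panicked"));
        }
        self.spawn(future);
        completed
    }
}
```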

Note that I did *not* change the `instance_updater` background task to
use `ParallelTaskSet` in this manner, as all it does is run
`instance_update` sagas. Unlike the `instance_watcher` and
`webhook_deliverator` background tasks, which make HTTP requests to
sled-agents and external webhook endpoints, respectively, this task just
spawns sagas and waits for them to finish. So, its spawned tasks aren't
actually doing any _work_ besides waiting on a `RunningSaga` future to
complete, and the actual work is performed in the saga executor.
Limiting the concurrency of the actual work would require the limit to
be implemented in the saga executor rather than in the background task.
Also, it's important that all the sagas be _started_ as soon as
possible, even if the current Nexus does not execute them, so that they
may be picked up by other Nexii.

Similarly, the `instance_reincarnation` task also performs a
query-spawn-batch-wait type loop, but in that case, it's necessary: the
saga started for each instance in the query performs the state change
that evicts that instance from subsequent queries. Therefore, that task
_must_ wait for all sagas in the batch to complete before proceeding.